Search CORE

2 research outputs found

Fake malware opcodes generation using HMM and different GAN algorithms

Author: Trehan Harshit
Publication venue: SJSU ScholarWorks
Publication date: 25/05/2021
Field of study

Malware, or malicious software, is a program that is intended to harm systems. In the past decade, the number of malware attacks have grown and, more importantly, evolved. Many researchers have successfully integrated cutting edge Machine Learning techniques to combat this ever present and growing threat to cyber and information security. One big challenge faced by many researchers is the lack of enough data to train machine learning models and specifically deep neural networks properly. Generative modelling has proven to be very efficient at generating synthesized data that can match the actual data distribution. In this project, we aim to generate malware samples as opcode sequences and attempt to differentiate between the fake and real samples. We use different Generative Adversarial Networks (GAN) algorithms and Hidden Markov Models (HMM) to generate fake samples

SJSU ScholarWorks

Fake Malware Generation Using HMM and GAN

Author: Di Troia Fabio
Trehan Harshit
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2022
Field of study

In the past decade, the number of malware attacks have grown considerably and, more importantly, evolved. Many researchers have successfully integrated state-of-the-art machine learning techniques to combat this ever present and rising threat to information security. However, the lack of enough data to appropriately train these machine learning models is one big challenge that is still present. Generative modelling has proven to be very efficient at generating image-like synthesized data that can match the actual data distribution. In this paper, we aim to generate malware samples as opcode sequences and attempt to differentiate them from the real ones with the goal to build fake malware data that can be used to effectively train the machine learning models. We use and compare different Generative Adversarial Networks (GAN) algorithms and Hidden Markov Models (HMM) to generate such fake samples obtaining promising results

SJSU ScholarWorks